Learning Näıve Bayes Tree for Conditional Probability Estimation
نویسندگان
چکیده
Näıve Bayes Tree uses decision tree as the general structure and deploys näıve Bayesian classifiers at leaves. The intuition is that näıve Bayesian classifiers work better than decision trees when the sample data set is small. Therefore, after several attribute splits when constructing a decision tree, it is better to use näıve Bayesian classifiers at the leaves than to continue splitting the attributes. In this paper, we propose a learning algorithm to improve the conditional probability estimation in the diagram of Näıve Bayes Tree. The motivation for this work is that, for cost-sensitive learning where costs are associated with conditional probabilities, the score function is optimized when the estimates of conditional probabilities are accurate. The additional benefit is that both the classification accuracy and Area Under the Curve (AUC) could be improved. On a large suite of benchmark sample sets, our experiments show that the CLL tree outperforms the state-of-art learning algorithms, such as Näıve Bayes Tree and näıve Bayes significantly in yielding accurate conditional probability estimation and improving classification accuracy and AUC.
منابع مشابه
A Comparative Analysis of Methods for Probability Estimation Tree
In this paper, we address the problem of probability estimation of decision trees. This problem has received considerable attention in the areas of machine learning and data mining, and techniques to use tree models as probability estimators have been suggested. We make a comparative study of six well-known class probability estimation methods, measured by classification accuracy, AUC and Condi...
متن کاملConditional Independence Trees
It has been observed that traditional decision trees produce poor probability estimates. In many applications, however, a probability estimation tree (PET) with accurate probability estimates is desirable. Some researchers ascribe the poor probability estimates of decision trees to the decision tree learning algorithms. To our observation, however, the representation also plays an important rol...
متن کاملBayes Networks and Fault Tree Analysis Application in Reliability Estimation (Case Study: Automatic Water Sprinkler System)
In this study, the application of Bayes networks and fault tree analysis in reliability estimation have been investigated. Fault tree analysis is one of the most widely used methods for estimating reliability. In recent years, a method called "Bayes Network" has been used, which is a dynamic method, and information about the probable failure of the system components will be updated according to...
متن کاملBayesian network classifiers which perform well with continuous attributes: Flexible classifiers
When modelling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous works have solved the problem by discretizing them with the consequent loss of information. Another common alternative assumes that the data are generated by a Gaussian distribution (parametric approach), such as conditional Gaussian networks, wit...
متن کاملAn Empirical Study on Class Probability Estimates in Decision Tree Learning
Decision tree is one of the most effective and widely used models for classification and ranking and has received a great deal of attention from researchers in the domain of data mining and machine learning. A critical problem in decision tree learning is how to estimate the classmembership probabilities from decision trees. In this paper, we firstly survey all kinds of class probability estima...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005